Python - pandas loc vs iloc
This article explains the features of, and differences between, loc and iloc, the two indexers you need when working with DataFrames in Python's pandas library. Both are used with square brackets (df.loc[...], df.iloc[...]), not called like functions.
The examples use the iris dataset, loaded through seaborn.
>>> import seaborn as sns
>>> iris = sns.load_dataset('iris')
>>> iris.head()
1. loc
loc is an indexer that selects data by label. It accesses rows and columns by their names (labels), so you state explicitly which row and column labels to select. It also accepts a boolean mask, which keeps the rows where the condition is True.
# Select the rows whose 'species' column equals 'virginica'
>>> iris.loc[iris['species'] == 'virginica']
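Besides boolean masks, loc accepts single labels, lists of labels, and label slices. The following is a minimal sketch that assumes a small hand-built DataFrame (hypothetical data and row labels, not the iris dataset) so that it runs without downloading anything.

```python
import pandas as pd

# Hypothetical sample data; the row labels are strings, not integers
df = pd.DataFrame(
    {
        "sepal_length": [5.1, 7.0, 6.3],
        "species": ["setosa", "versicolor", "virginica"],
    },
    index=["a", "b", "c"],
)

# Single value, addressed by row label and column label
value = df.loc["b", "sepal_length"]

# Label slices include both endpoints: rows "a" and "b" are returned
subset = df.loc["a":"b", ["species"]]

# Boolean mask for the rows combined with a column label
virginica_lengths = df.loc[df["species"] == "virginica", "sepal_length"]
```

Note that a label slice such as "a":"b" includes the end label, unlike ordinary Python slicing.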
2. iloc
iloc is an indexer that selects data by integer position. It accesses rows and columns by their zero-based positions, regardless of what their labels are. That is, you state explicitly where the data sits, as integers, to select it.
# Select the data in the first row and second column
>>> iris.iloc[0, 1]
3.5
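iloc also supports slices and negative positions, following the usual Python conventions. The sketch below again assumes a small hypothetical DataFrame rather than the iris dataset.

```python
import pandas as pd

# Hypothetical sample data with string row labels
df = pd.DataFrame(
    {"sepal_length": [5.1, 7.0, 6.3], "sepal_width": [3.5, 3.2, 3.3]},
    index=["a", "b", "c"],
)

# First row, second column (positions start at 0)
first_cell = df.iloc[0, 1]

# Position slices exclude the end: rows at positions 0 and 1 only
first_two_rows = df.iloc[0:2]

# Negative positions count from the end
last_row = df.iloc[-1]
```

The row labels ("a", "b", "c") play no part in the selection here; only the positions matter.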
3. Differences between loc and iloc
- Type of index:
  - loc uses labels, so the row and column names can be strings or other data types.
  - iloc uses positions, so rows and columns are always addressed by zero-based integers, whatever their labels are.
- Usage:
  - loc selects data by explicit labels.
  - iloc selects data by integer positions.
- Examples:
  - Example of loc: df.loc['A', 'column_name']
  - Example of iloc: df.iloc[0, 1]
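The difference is easiest to see when the row labels are integers that do not match the row positions. A minimal sketch, assuming a hypothetical DataFrame whose index is 10, 20, 30:

```python
import pandas as pd

# Hypothetical data: integer row labels that differ from the positions
df = pd.DataFrame({"value": ["x", "y", "z"]}, index=[10, 20, 30])

by_label = df.loc[10, "value"]    # label 10 is the first row
by_position = df.iloc[0, 0]       # position 0 is also the first row

# There is no row labeled 0, so loc raises KeyError where iloc succeeds
try:
    df.loc[0, "value"]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

# Filtering keeps the original labels but shifts the positions
filtered = df[df["value"] != "x"]
kept_label = filtered.loc[20, "value"]   # still found under label 20
new_first = filtered.iloc[0, 0]          # the same row is now at position 0
```

With the default RangeIndex (0, 1, 2, ...) the labels and positions coincide, which is why loc and iloc can appear interchangeable until rows are filtered or reordered.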
Which one to use depends on the structure of the DataFrame and on your goal. loc is useful when rows and columns have meaningful labels, while iloc is useful when you want to select by position regardless of the labels.